Chat Models

| Organization | Model Name | API Model String | Context length | Quantization |
| --- | --- | --- | --- | --- |
| OpenAI | GPT OSS 120B | openai/gpt-oss-120b | 128000 | MXFP4 |
| OpenAI | GPT OSS 20B | openai/gpt-oss-20b | 128000 | MXFP4 |
| DeepSeek | DeepSeek R1 Distill Llama 70B | deepseek-ai/deepseek-r1-distill-llama-70b | 65000 | FP16 |
| DeepSeek | DeepSeek R1 0528 | deepseek-ai/DeepSeek-R1-0528 | 131072 | FP16 |
| DeepSeek | DeepSeek V3.2 | deepseek-ai/DeepSeek-V3.2 | 131072 | FP16 |
| Mistral AI | Mistral (7B) Instruct v0.3 | mistralai/Mistral-7B-Instruct-v0.3 | 32768 | FP16 |
| NVIDIA | Nemotron Orchestrator 8B | nvidia/Orchestrator-8B | 16384 | FP16 |
| NVIDIA | Nemotron 3 Nano 30B | nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B-BF16 | 262144 | BF16 |
| NVIDIA | Nemotron 3 Super 120B A12B | nvidia/NVIDIA-Nemotron-3-Super-120B-A12B-FP8 | 262144 | FP8 |
| Microsoft | Fara 7B | microsoft/Fara-7B | 8192 | FP16 |
| Meta | Llama 3.3 70B Instruct | meta-llama/Llama-3.3-70B-Instruct | 8192 | FP16 |
| Qwen | Qwen3 Plus | Qwen/Qwen3-Plus | 131072 | FP16 |
| Qwen | Qwen3 Next 80B A3B Thinking | Qwen/Qwen3-Next-80B-A3B-Thinking | 131072 | FP16 |
| Qwen | Qwen3 Max | Qwen/Qwen3-Max | 131072 | FP16 |
| Moonshot AI | Kimi K2 Thinking | moonshotai/Kimi-K2-Thinking | 131072 | FP16 |
| ZhipuAI | GLM 4.7 FP8 | zai-org/GLM-4.7-FP8 | 131072 | FP8 |
| MiniMax | MiniMax M2.1 | MiniMaxAI/MiniMax-M2.1 | 131072 | FP16 |
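
The context lengths listed for the chat models can be used to check, before sending a request, whether a prompt is likely to fit a given model. The following is a minimal sketch, not part of any official client: the helper names `fits_context` and `models_that_fit` are hypothetical, only a handful of rows are copied from the table, and it assumes the context length covers prompt tokens and generated tokens combined, which this page does not state.

```python
# Context lengths (in tokens) copied from a few rows of the Chat Models table.
CHAT_CONTEXT_LENGTHS = {
    "openai/gpt-oss-120b": 128000,
    "deepseek-ai/DeepSeek-V3.2": 131072,
    "mistralai/Mistral-7B-Instruct-v0.3": 32768,
    "meta-llama/Llama-3.3-70B-Instruct": 8192,
    "nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B-BF16": 262144,
}


def fits_context(model: str, prompt_tokens: int, max_output_tokens: int) -> bool:
    """Return True if prompt plus requested output fits the listed context length.

    Assumption: the listed context length bounds input and output together.
    """
    return prompt_tokens + max_output_tokens <= CHAT_CONTEXT_LENGTHS[model]


def models_that_fit(prompt_tokens: int, max_output_tokens: int) -> list[str]:
    """Return the API model strings (sorted) whose context length is large enough."""
    return sorted(
        model
        for model in CHAT_CONTEXT_LENGTHS
        if fits_context(model, prompt_tokens, max_output_tokens)
    )


if __name__ == "__main__":
    # A 100k-token prompt with 4k tokens reserved for the reply.
    for name in models_that_fit(100_000, 4_000):
        print(name)
```

Token counts depend on each model's tokenizer, so the numbers passed in should come from the tokenizer of the model being checked.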

Code Models

| Organization | Model Name | API Model String | Context length | Quantization |
| --- | --- | --- | --- | --- |
| Qwen | Qwen3 Coder 30B A3B Instruct | Qwen/Qwen3-Coder-30B-A3B-Instruct | 131000 | FP16 |
| Qwen | Qwen3 Coder Plus | Qwen/Qwen3-Coder-Plus | | |
| Qwen | Qwen3 Coder Next | Qwen/Qwen3-Coder-Next | | |
| Qwen | Qwen3 Coder Flash | Qwen/Qwen3-Coder-Flash | | |
| Qwen | Qwen3 Coder 480B A35B Instruct | Qwen/Qwen3-Coder-480B-A35B-Instruct | | |

Image Models

| Organization | Model Name | API Model String | Model Type | Default steps |
| --- | --- | --- | --- | --- |
| Pruna AI | P-Image | p-image | Image Generation | |
| Pruna AI | P-Image LoRA | p-image-lora | Image Generation | |
| Pruna AI | P-Image Edit | p-image-edit | Image Edit | |
| Pruna AI | P-Image Edit LoRA | p-image-edit-lora | Image Edit | |
| Qwen Tongyi MAI | Z Image Turbo | Tongyi-MAI/Z-Image-Turbo | Image Generation | 9 |
| Stability AI | Stable Diffusion 3.5 Large | stabilityai/stable-diffusion-3.5-large | Image Generation | 30 |
| Qwen | Qwen Image Edit | Qwen/Qwen-Image-Edit | Image Edit | 20 |
| Black Forest Labs | Flux Dev | black-forest-labs/flux-dev | Image Generation | |
| Black Forest Labs | Flux 2 Klein 4B | black-forest-labs/flux-2-klein-4b | Image Generation | |

Audio Models

| Organization | Modality | Model Name | API Model String |
| --- | --- | --- | --- |
| OpenAI | Speech-to-Text | Whisper Large v3 | openai/whisper-large-v3 |
| Qwen | Text-to-Speech | Qwen3 TTS Flash | qwen3-tts-flash |

Video Models

| Organization | Model Name | API Model String | Max Duration | Max Resolution |
| --- | --- | --- | --- | --- |
| Pruna AI | P-Video | p-video | 10 seconds | 1080p |

OCR Models

| Organization | Model Name | API Model String | Context length |
| --- | --- | --- | --- |
| Tencent | Hunyuan OCR (1B) | tencent/HunyuanOCR | 16000 |

Vision Models

| Organization | Model Name | API Model String | Context length |
| --- | --- | --- | --- |
| Qwen | Qwen3-VL 8B Instruct | Qwen/Qwen3-VL-8B-Instruct | 32768 |
| Qwen | Qwen3-VL 30B A3B Instruct | Qwen/Qwen3-VL-30B-A3B-Instruct | 128000 |
| Qwen | Qwen2.5-VL 7B Instruct | Qwen/Qwen2.5-VL-7B-Instruct | 32768 |
| Qwen | Qwen3-VL Plus | Qwen/Qwen3-VL-Plus | |
| Qwen | Qwen3-VL Flash | Qwen/Qwen3-VL-Flash | |
| Qwen | Qwen3-VL 235B A22B Instruct | Qwen/Qwen3-VL-235B-A22B-Instruct | |
| Qwen | Qwen3-VL 235B A22B Thinking | Qwen/Qwen3-VL-235B-A22B-Thinking | 262144 |
| Qwen | Qwen3.5 397B A17B | Qwen/Qwen3.5-397B-A17B | 262144 |
| Qwen | Qwen3.5 122B A10B | Qwen/Qwen3.5-122B-A10B | 262144 |
| Qwen | Qwen3.5 27B | Qwen/Qwen3.5-27B | 262144 |
| Qwen | Qwen3.5 35B A3B | Qwen/Qwen3.5-35B-A3B | 262144 |
| Qwen | Qwen3.5 Flash | Qwen/Qwen3.5-Flash | 1048576 |
| Moonshot AI | Kimi K2.5 | moonshotai/Kimi-K2.5 | 262144 |

Embedding Models

| Model Name | API Model String | Model Size | Embedding Dimension | Context Window |
| --- | --- | --- | --- | --- |
| BGE-Large-EN-v1.5 | BAAI/bge-large-en-v1.5 | 326M | 1024 | 512 |